Creating a Cross-Domain Capable ML Pipeline

Introduction

As classifying images into categories is a ubiquitous task across many domains, the need for a machine learning pipeline that can accommodate new categories is easy to justify. In particular, two common requirements are to filter out low-quality images (blurred, low contrast, etc.) and to speed up the learning of new categories whenever image quality is sufficient. Repurposing a trained model for a different (yet related) task is known as transfer learning. In this blog post, resulting from joint work with Aigiz Kunafin (Data Scientist at TIMETOACT GROUP Austria), we compare several image classification models from the transfer learning perspective.

Assessment dataset

Our starting point is the task of vehicle classification, based on the well-known Kaggle dataset from Tampere University. The images vary greatly in shape and size as well as in content. We first focus on only two classes, namely cars and trucks. The car class contains images from all angles and of almost all motor car types, from antique cars to racing cars. The truck class comprises nearly every truck type one can imagine: besides the “ordinary” trucks usually seen on the streets, there are images of military, fire, tow, garbage and tank trucks, to name a few. In addition to cars and trucks, we artificially generate a third class to simulate bad image quality. To do so, we draw a sample of equal size from each class and randomly augment the images (flipping, blurring, grain, motion blur, darkness). The final dataset contains 971 images per class.
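For illustration, here is a minimal sketch of how such a “bad quality” class could be generated with Pillow and NumPy. Folder names, sample sizes and augmentation parameters are hypothetical, and the exact augmentations we used may differ:

```python
import random
from pathlib import Path
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter, ImageOps

def degrade(img: Image.Image) -> Image.Image:
    """Apply one random degradation to simulate bad image quality."""
    choice = random.choice(["blur", "motion", "dark", "grain", "flip"])
    if choice == "blur":
        return img.filter(ImageFilter.GaussianBlur(radius=5))
    if choice == "motion":
        return img.filter(ImageFilter.BoxBlur(7))          # rough stand-in for motion blur
    if choice == "dark":
        return ImageEnhance.Brightness(img).enhance(0.3)   # strongly darkened image
    if choice == "grain":
        arr = np.asarray(img, dtype="float32")
        arr += np.random.normal(0, 25, arr.shape)          # additive noise ("grain")
        return Image.fromarray(arr.clip(0, 255).astype("uint8"))
    return ImageOps.flip(img)                              # upside-down image

src_dirs = [Path("data/car"), Path("data/truck")]          # hypothetical folder layout
out_dir = Path("data/bad_quality")
out_dir.mkdir(parents=True, exist_ok=True)

for src in src_dirs:
    files = list(src.glob("*.jpg"))
    for img_path in random.sample(files, k=len(files) // 2):   # equal-sized sample per class
        degrade(Image.open(img_path).convert("RGB")).save(out_dir / img_path.name)
```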

Finding an appropriate classification algorithm

Our approach is to implement the architectures below in order to find a classification algorithm with high accuracy for our pipeline. After shaping the data so it can be used by the models, we hold out 20% of the data as a test set of unseen images for comparing the shortlisted models. The remaining 80% is used for training and validating the models during their training processes, with an 85/15 train/validation split.
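As a sketch of this splitting step (assuming scikit-learn and a stratified split, which the post does not prescribe; the folder layout is hypothetical):

```python
from pathlib import Path
from sklearn.model_selection import train_test_split

# Scan the three class folders (hypothetical layout) into parallel path/label lists
image_paths, labels = [], []
for cls in ["car", "truck", "bad_quality"]:
    for p in Path("data", cls).glob("*.jpg"):
        image_paths.append(p)
        labels.append(cls)

# 20% held out as the final test set of unseen images
trainval_x, test_x, trainval_y, test_y = train_test_split(
    image_paths, labels, test_size=0.20, stratify=labels, random_state=42)

# The remaining 80% is split again into 85% training / 15% validation
train_x, val_x, train_y, val_y = train_test_split(
    trainval_x, trainval_y, test_size=0.15, stratify=trainval_y, random_state=42)
```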

Baseline

As the baseline, we choose a standard PyTorch-based learning pipeline with the ResNet50 model, pre-trained on the ImageNet-1k dataset (also referred to as ILSVRC2012) at a resolution of 224x224 pixels. This model comes in handy for straightforward image classification tasks, as it is already familiar with everyday objects, including cars and trucks.
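A condensed sketch of such a baseline fine-tuning loop might look as follows (torchvision ≥ 0.13 weights API; the data path, batch size, learning rate and epoch count are illustrative, not our exact settings):

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pre-trained ResNet50 (ImageNet-1k); replace the final layer with a 3-class head
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 3)
model = model.to(device)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
train_ds = datasets.ImageFolder("data/train", transform=preprocess)  # hypothetical path
train_loader = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(5):                      # illustrative number of epochs
    model.train()
    for images, targets in train_loader:
        images, targets = images.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
```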

Stepwise Classification Pipeline

With this approach we split up the multinomial classification task and use two distinct binary classification models. A first model is trained to detect whether an image belongs to the bad-quality class. If not, a second model determines the class of the image (in this case, car or truck). This could prove beneficial, as each model can specialize in its respective task. Both models have the same structure as the baseline, i.e. a ResNet50 model that is already familiar with car and truck classes.
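Conceptually, inference in the stepwise pipeline chains the two binary models; the label indices and class names below are assumptions for illustration:

```python
import torch
import torch.nn as nn
from torchvision import models

def make_binary_resnet50() -> nn.Module:
    """Pre-trained ResNet50 with a 2-class head, same setup as the baseline."""
    m = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    m.fc = nn.Linear(m.fc.in_features, 2)
    return m

quality_model = make_binary_resnet50().eval()   # trained on: ok vs. bad image quality
vehicle_model = make_binary_resnet50().eval()   # trained on: car vs. truck

@torch.no_grad()
def stepwise_predict(image_tensor: torch.Tensor) -> str:
    """Step 1: reject bad-quality images; step 2: classify the vehicle."""
    if quality_model(image_tensor).argmax(dim=1).item() == 1:   # index 1 = "bad quality" (assumed)
        return "bad_quality"
    return ["car", "truck"][vehicle_model(image_tensor).argmax(dim=1).item()]
```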

🤗-Transformers Models

“Hugging Face” Transformers is an API that provides state-of-the-art pre-trained models for machine learning as well as a very user-friendly way to implement them. Once the data is arranged in an appropriate folder structure, a single code chunk specifying all necessary model parameters is enough: the API downloads, trains and saves the classification model automatically. In the following we test the three most-liked image classification models on the API: a ViT from Google, a DeiT from Facebook and a DiT from Microsoft. The Vision Transformer (ViT) from Google is currently by far the most downloaded pre-trained image classification model on the Hugging Face (HF) platform; it is pre-trained on ImageNet-21k and fine-tuned on ImageNet-1k. This model type contrasts with the approaches above, as ResNet50 is a Convolutional Neural Network, the architecture commonly used for such tasks. Vision Transformers, however, may well overtake CNNs in computer vision tasks in the near future, as several sources speculate and show [1].
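A sketch of such a fine-tuning run with the HF Trainer might look like this (the checkpoint name matches the ViT described above; the folder layout and hyperparameters are illustrative, not our exact configuration):

```python
import torch
from datasets import load_dataset
from transformers import (AutoImageProcessor, AutoModelForImageClassification,
                          Trainer, TrainingArguments)

checkpoint = "google/vit-base-patch16-224"   # ViT pre-trained on ImageNet-21k, fine-tuned on ImageNet-1k

# Expects data/train/<class>/*.jpg and data/validation/<class>/*.jpg (hypothetical layout)
dataset = load_dataset("imagefolder", data_dir="data")
labels = dataset["train"].features["label"].names

processor = AutoImageProcessor.from_pretrained(checkpoint)

def preprocess(batch):
    batch["pixel_values"] = processor(batch["image"], return_tensors="pt")["pixel_values"]
    return batch

dataset = dataset.with_transform(preprocess)

model = AutoModelForImageClassification.from_pretrained(
    checkpoint,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={l: i for i, l in enumerate(labels)},
    ignore_mismatched_sizes=True,   # replace the 1000-class ImageNet head with our own head
)

def collate(examples):
    return {"pixel_values": torch.stack([e["pixel_values"] for e in examples]),
            "labels": torch.tensor([e["label"] for e in examples])}

args = TrainingArguments(output_dir="vit-vehicles", num_train_epochs=4,
                         per_device_train_batch_size=16, learning_rate=2e-5)

trainer = Trainer(model=model, args=args, data_collator=collate,
                  train_dataset=dataset["train"], eval_dataset=dataset["validation"])
trainer.train()
trainer.save_model("vit-vehicles")
```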

Results

Below you can see the classification performance of the different models on the previously unseen test set. As we trained all models on Google Colab, training times may not be fully comparable, since the allotted processing power depends on current availability.

Overall, the evaluation metrics are very high, which is probably partly due to the fact that the models were already pre-trained to detect several types of cars and trucks. Nonetheless, the HF models show the best performance not only across all evaluation metrics but also in a qualitative analysis, in which we manually check all mispredicted images to see whether the model failed badly or the wrong prediction was due to an ambiguous test image. Taking this into account, the performance of the pre-trained Google ViT model from HF is outstanding, as it made almost no obvious mistakes (also see the confusion matrix of this model on the right). Training two separate models for each classification step (“stepwise approach”), however, turned out to worsen performance compared to the baseline.

As the HF ViT model from Google performs best and is also very simple to implement, we now test the corresponding training pipeline on datasets from other domains. First, we try a dataset consisting of bad-quality images plus images that either do or do not contain garbage bins. The newly trained model also works very well on this dataset, achieving an accuracy close to 90%. Using further datasets from different domains and with multiple classes, we consistently reach accuracies above 90%, as you can see in the table below.
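Reusing a trained checkpoint for inference on a new domain is then essentially a one-liner with the HF pipeline API (the checkpoint path and labels below are hypothetical):

```python
from transformers import pipeline

# Load a fine-tuned checkpoint such as the one saved above
classifier = pipeline("image-classification", model="vit-garbage-bins")
print(classifier("new_image.jpg"))
# e.g. [{'label': 'garbage_bin', 'score': 0.95}, {'label': 'no_bin', 'score': 0.04}, ...]
```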

Validation of Google ViT model

As the HF Google ViT model performed worse on the garbage bin dataset, we now want to check whether this model is nonetheless superior to the other algorithms. We therefore train the models from before once again, this time on the garbage bin dataset, and compare the results. Here, the DeiT model from Facebook actually performs best, with an accuracy of almost 88%, though the ViT from Google is nearly as good at 86%. Please keep in mind that the test dataset contains only 108 images, so the values should not be compared too strictly.

Conclusion

In the end, it is remarkable how straightforward it is to set up and use an HF Transformers pipeline for image classification while also obtaining excellent performance. On the initial classification problem of cars and trucks, we achieved very high accuracy values using the very popular HF ViT from Google. This may partly be because the models were already pre-trained on these two classes on ImageNet data and because the training and test data were very clean. In addition, it turned out that this pipeline can very easily be fed with new data from other domains while maintaining very good performance. Depending on the concrete image classification problem, however, other approaches might perform better.

Of course, there are almost endless further approaches one could try. Provided appropriately labeled data is available, it might be feasible, for example, to first use a basic classifier to check image quality and then use an object detection model as a classifier that tries to find the actual object in the image. There are also many other pre-trained models and classification techniques one could test. The Vision Transformer model from the HF API, however, already sets a very high standard: it promises good performance in different domains while needing little training thanks to pre-training on the ImageNet dataset. This pipeline can thus serve as a first port of call, proving to be a fast, simple and accurate image classification technique for a wide range of use cases.


References: 

[1] Cf. https://arxiv.org/pdf/2101.01169.pdf, https://arxiv.org/pdf/2010.11929.pdf, https://becominghuman.ai/transformers-in-vision-e2e87b739feb, https://towardsdatascience.com/are-transformers-better-than-cnns-at-image-recognition-ced60ccc7c8

Contact

Christoph Hasenzagl
TIMETOACT GROUP Österreich GmbH